Arabic Text Categorization using Machine Learning Approaches

Author

  • Riyad Alshammari
Abstract

Arabic text categorization is considered one of the more difficult classification problems for machine learning algorithms. Achieving high accuracy in Arabic text categorization depends on the preprocessing techniques used to prepare the data set. This paper therefore investigates the impact of preprocessing methods on the performance of three machine learning algorithms: Naïve Bayesian, DMNBtext, and C4.5. Results show that the DMNBtext learning algorithm achieved higher performance than the other machine learning algorithms in categorizing Arabic text.

Keywords: Arabic text; categorization; machine learning
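
To make the role of preprocessing concrete, the following is a minimal sketch, not the paper's pipeline. The abstract does not say which preprocessing steps were evaluated, so the steps here (diacritic removal, letter normalization, stop-word filtering) are assumed as commonly used examples. A plain multinomial Naive Bayes classifier with add-one smoothing stands in for the algorithms compared in the paper. The toy documents, labels, and the names `normalize`, `tokenize`, and `NaiveBayes` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Arabic short-vowel marks (U+064B..U+0652) and tatweel (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")


def normalize(text):
    """Assumed normalization: strip diacritics and unify letter variants."""
    text = _DIACRITICS.sub("", text)
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)  # alef variants -> bare alef
    text = text.replace("\u0649", "\u064A")  # alef maqsura -> yeh
    text = text.replace("\u0629", "\u0647")  # teh marbuta -> heh
    return text


def tokenize(text, stopwords=frozenset()):
    """Split normalized text into Arabic-letter tokens and drop stop words."""
    tokens = re.findall("[\u0621-\u064A]+", normalize(text))
    return [t for t in tokens if t not in stopwords]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Hypothetical toy corpus: two sports sentences, two economy sentences.
STOPWORDS = frozenset({"في"})
train = [
    ("فاز الفريق في المباراة", "sports"),
    ("سجل اللاعب هدفا في المباراة", "sports"),
    ("ارتفع سعر النفط في السوق", "economy"),
    ("انخفض سعر الذهب في السوق", "economy"),
]
model = NaiveBayes().fit(
    [tokenize(text, STOPWORDS) for text, _ in train],
    [label for _, label in train],
)

# A vocalized query still matches the training vocabulary after normalization.
print(model.predict(tokenize("فاز الفريق في المُبَارَاة", STOPWORDS)))
```

Without the normalization step, the vocalized word in the query would not match its unvocalized form in the training data, which is one way preprocessing choices can change classifier accuracy.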

Similar resources

Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media

Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyberbullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "tweets" via one of the machine l...

Arabic Text Categorization Using Classification Rule Mining

Text categorization is one of the known problems in classification data mining. It aims to map text documents into one or more predefined classes or categories based on their keyword content. This problem has recently attracted many scholars in the data mining and machine learning communities since the numbers of online documents that hold useful information for decision makers are numerous...

Automated Arabic Text Categorization Using SVM and NB

Text classification is a supervised learning technique that uses labeled training data to derive a classification system (classifier) and then automatically classifies unlabeled text data using the derived classifier. In this paper, we investigate the Naïve Bayesian method (NB) and the Support Vector Machine algorithm (SVM) on different Arabic data sets. The bases of our comparison are the most popula...

An Intelligent System Based on Statistical Learning For Searching in Arabic Text

In this paper, a novel Arabic text categorization system has been developed based on statistical learning. The system uses a new method for feature extraction. The system has been implemented and tested using an Arabic text corpus. Results demonstrate the efficiency of the proposed system in categorizing Arabic documents. Moreover, the system proved powerful in grasping the semanti...

Publication date: 2018